Albumin-based nucleotides, their replication and use, and plasmids for use therein.

The DNA sequence coding for human serum albumin
has been isolated and inserted as two fragments into two
novel plasmids which can be replicated in E. coli. These
novel fragments can be joined to provide a unitary DNA
sequence which then can be cloned into a suitable host, e.g.
E. coli, for the expression of human serum albumin (which is
used extensively in medical practice in treating shock
conditions).
ALBUMIN-BASED NUCLEOTIDES, THEIR REPLICATION
AND USE, AND PLASMIDS FOR USE THEREIN 

This invention relates to nucleotides related to
human serum albumin (HSA), their replication and use,
and plasmids (and host substances) for use therein.
The gene for serum albumin is regulated in
development. Serum albumin is synthesised in mammals
by the adult liver, and its synthesis reaches a plateau in
adulthood. The embryonic liver and yolk sac, on the
other hand, produce predominantly α-fetoprotein, but this
synthesis decreases drastically after birth. Recently,
Law et al. determined the complete sequence of mouse
α-fetoprotein mRNA, Nature 291 (1981) 201-205. The
structure revealed extensive homology to mammalian serum
albumin, indicating that the two proteins are encoded
in the same gene family. Similar conclusions have been
reached from studies on the α-fetoprotein genes of the
rat and the mouse; see Jagodzinski et al., Proc. Natl.
Acad. Sci. USA, 78 (1981) 3521-3525, and Gorin et al.,
J. Biol. Chem. 256 (1981) 1954-1959.

The complete nucleotide sequence of human serum albumin
mRNA has been determined from recombinant cDNA clones and
from a primer-extended cDNA synthesis on the mRNA
template. The sequence comprises 2,078 nucleotides,
starting upstream of a potential ribosome binding site
in the 5'-untranslated region. It contains all the
translated codons and extends into the poly(A) at the
3'-terminus. Part of the translated sequence codes for a
hydrophobic prepeptide met-lys-trp-val-thr-phe-ile-ser-
leu-leu-phe-leu-phe-ser-ser-ala-tyr-ser, followed by a
basic propeptide arg-gly-val-phe-arg-arg. These signal
peptides are absent from mature serum albumin and, so
far, have not been identified in their nascent state in
humans. The remaining 1,755 nucleotides of the translated
mRNA sequence code for 585 amino acids which are in
agreement, with few exceptions, with the published amino
acid data for human serum albumin. The mRNA sequence
verifies and refines the repeating homology in the
triple-domain structure of the serum albumin molecule.
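
The residue counts quoted above can be checked directly from the peptide
sequences as written. The following is a minimal sketch, not part of the
original disclosure; the helper names (residues, coding_nucleotides) are
hypothetical and chosen only for illustration.

```python
# Peptide sequences copied from the description (three-letter codes).
PREPEPTIDE = ("met-lys-trp-val-thr-phe-ile-ser-leu-leu-phe-leu-phe-"
              "ser-ser-ala-tyr-ser")
PROPEPTIDE = "arg-gly-val-phe-arg-arg"
MATURE_CODING_NT = 1755  # nucleotides coding for mature serum albumin


def residues(peptide):
    """Split a hyphen-separated three-letter peptide into its residues."""
    return peptide.split("-")


def coding_nucleotides(peptide):
    """Each amino acid is specified by one codon of three nucleotides."""
    return 3 * len(residues(peptide))


if __name__ == "__main__":
    print("prepeptide residues:", len(residues(PREPEPTIDE)))
    print("prepeptide coding nucleotides:", coding_nucleotides(PREPEPTIDE))
    print("propeptide residues:", len(residues(PROPEPTIDE)))
    print("propeptide coding nucleotides:", coding_nucleotides(PROPEPTIDE))
    print("mature albumin residues:", MATURE_CODING_NT // 3)
```

The sketch only counts codons; it does not attempt to recover the
nucleotide sequence itself, which is given in Table 1.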
Human serum albumin cDNA was cloned into the PstI site of plasmid
pBR322 by the oligo(dG)-oligo(dC) tailing technique. Plasmid DNA was
isolated from 97 positive colonies which hybridized to the enriched
albumin cDNA probe, and the recombinant plasmid pHA36 was found to
contain the largest insert of an albumin cDNA sequence. Its
restriction endonuclease map is shown in the drawing, together with a
restriction map of the primer-extended plasmid clone pHA206. The latter
was obtained in a second transformation experiment after initiating
the cDNA synthesis from an internal primer. This primer was a
91 base-pair DNA fragment, MspI(152)-TaqI(182/3), isolated from pHA36.
The two plasmids, pHA36 and pHA206, share 0.15 kb of homologous DNA.
Together, they encode the entire sequence for human serum albumin,
starting with the CTT codon for leu -10 of the prepeptide and
extending into the 3'-untranslated region of poly(A).

Sequence of the Albumin cDNA. The sequence was determined for the
most part on both DNA strands to ensure accuracy. All of the
restriction sites used to end-label DNA fragments were sequenced across
by labeling a neighboring restriction site. The entire nucleotide
sequence of the serum albumin mRNA, as determined from the cloned DNA
in pHA36, pHA206, and from the primer-extended cDNA at the 5'-terminus
of the message, is shown in the following Table 1. The inferred amino
acid sequence is also indicated. The mRNA length is 2,078
nucleotides, of which 38 represent the 5'-untranslated region, 54
identify a prepeptide of 18 amino acids, 18 identify a propeptide of 6
amino acids, 1,755 code for the known 585 amino acids of serum albumin,
189 make up the 3'-untranslated region and 24 are the poly(A) sequence.
Nucleotides 5 to 15 (-34 to -24) in the 5'-untranslated region (Table
1) are complementary to a 3'-terminal region of eukaryotic 18S RNA
[Azad, A.A. and Deacon, N.J. (1980) Nucl. Acids Res. 8, 4365-4376] and
thus could represent a ribosome binding site:

(5')...T T C T C T T C T G T...            albumin mRNA

(3')...G A G G A A G G C G U C C m6A m6A   18S RNA
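
As a consistency check on the segment lengths reported above, the six
regions should add up to the stated mRNA length of 2,078 nucleotides, and
the translated regions should account for 18 + 6 + 585 codons. The sketch
below is illustrative only; the dictionary keys are hypothetical labels,
while the numbers are those given in the text.

```python
# Segment lengths (in nucleotides) as reported in the description.
SEGMENTS = {
    "5'-untranslated region": 38,
    "prepeptide": 54,
    "propeptide": 18,
    "mature albumin": 1755,
    "3'-untranslated region": 189,
    "poly(A)": 24,
}
REPORTED_MRNA_LENGTH = 2078

# Regions that are translated into protein.
TRANSLATED = ("prepeptide", "propeptide", "mature albumin")


def total_length(segments):
    """Sum of all segment lengths."""
    return sum(segments.values())


def translated_codons(segments):
    """Number of codons in the translated regions (3 nucleotides each)."""
    return sum(segments[name] for name in TRANSLATED) // 3


if __name__ == "__main__":
    print("sum of segments:", total_length(SEGMENTS))
    print("matches reported length:",
          total_length(SEGMENTS) == REPORTED_MRNA_LENGTH)
    print("translated codons:", translated_codons(SEGMENTS))
```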
The translated portion of the mRNA sequence codes for the signal
peptide and the main body of the albumin polypeptide chain. The
signal peptide is composed of a hydrophobic prepeptide of 18 amino
acids and a basic propeptide of 6 amino acids (Table 1). Since
prepeptides are removed from nascent secretory proteins (like albumin)
in the endoplasmic reticulum, they are seen only in vitro in
heterologous translation systems. As yet, they have not been found
within cells [Judah, J.D. and Quinn, P.S. (1977) FEBS 11th Mtg.,
Copenhagen 50, 21-29; and Strauss, A.W., Donohue, A.M., Bennett, C.D.,
Rodkey, J.A. and Alberts, A.W. (1977) Proc. Natl. Acad. Sci. USA 74,
1358-1362]. This is the first report of the presence and the sequence
of a prepeptide for human serum albumin. As it is with other secretory
proteins, the conversion of proalbumin to albumin takes place in the
Golgi vesicles, and the enzyme responsible for this cleavage is
probably cathepsin B [Judah, J.D. and Quinn, P.S. (1978) Nature 271,
384-385]. This is also a first report on the sequence of the
propeptide for normal human serum albumin.

At the 3'-end of the message, the putative polyadenylation signal
sequence, AATAAA, is located 164 nucleotides downstream from the amino
acid termination codon TAA and 16 nucleotides upstream from the
beginning of the poly(A) sequence. Another characteristic sequence
located near the polyadenylation site has been identified by Benoist,
et al. [Benoist, C., O'Hare, K., Breathnach, R. and Chambon, P. (1980)
Nucl. Acids Res. 8, 127-142]; the consensus sequence from several
mRNAs was concluded to be TTTTCACTGC. A similar sequence, TTTTCTCTGT,
is located 19 nucleotides upstream from the AATAAA hexanucleotide in
the human albumin mRNA (Table 1).
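
The degree of similarity between the two decanucleotides named above can
be made concrete by a position-by-position comparison. This is a minimal
sketch; the function name mismatch_positions is hypothetical, and only the
two sequences quoted in the text are used.

```python
CONSENSUS = "TTTTCACTGC"   # consensus sequence reported by Benoist et al.
ALBUMIN = "TTTTCTCTGT"     # similar sequence in the human albumin mRNA


def mismatch_positions(a, b):
    """Return the 1-based positions at which two equal-length
    sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    diffs = mismatch_positions(CONSENSUS, ALBUMIN)
    matches = len(CONSENSUS) - len(diffs)
    print("mismatch positions:", diffs)
    print("identical positions: %d of %d" % (matches, len(CONSENSUS)))
```

Under this comparison the two sequences differ at two of ten positions.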
TABLE 1

Following are examples which illustrate procedures, including the
best mode, for practicing the invention. These examples should not be
construed as limiting. All percentages are by weight and all solvent
mixture proportions are by volume unless otherwise noted.
Example 1 Isolation of Messenger RNA 

Human liver mRNA was obtained following the procedure of
Chirgwin, et al. [Chirgwin, J.M., Przybyla, A.E., MacDonald, R.J. and
Rutter, W.J. (1979) Biochemistry 18, 5294-5299]. Immunoprecipitation
of albumin-containing polysomes was performed according to Taylor and
Tse [Taylor, J.M. and Tse, T.P.H. (1976) J. Biol. Chem. 251, 7461-
7467]. In vitro translation of mRNA was carried out in a reticulocyte
cell-free system, following the instructions of the manufacturer (New
England Nuclear). The translation products were separated
electrophoretically according to Laemmli [Laemmli, U.K. (1970) Nature
227, 680-685].

Example 2 Cloning Procedures 

Double stranded cDNA was synthesized as described previously
[Law, S., Tamaoki, T., Kreuzaler, F. and Dugaiczyk, A. (1980) Gene 10,
53-61]. It was annealed to PstI-linearized pBR322 DNA [Bolivar, F.,
Rodriguez, R.L., Greene, P.J., Betlach, M.C., Heyneker, H.L., Boyer,
H.W., Crosa, J.H. and Falkow, S. (1977) Gene 2, 95-113] that had been
tailed with 15 dG residues/3'-terminus [Dugaiczyk, A., Robberson, D.L.
and Ullrich, A. (1980) Biochemistry 19, 5869-5873]. The annealed DNA
was used to transform E. coli strain RR1, as detailed previously [Law,
S., et al., Ibid.]. The albumin clones were selected using the colony
hybridization method of Grunstein and Hogness [Grunstein, M. and
Hogness, D.S. (1975) Proc. Natl. Acad. Sci. USA 72, 3961-3965], with
[32P]-labeled cDNA synthesized with the immunoprecipitated polysomal
mRNA as template.

As shown in Example 5, plasmids pHA36 and pHA206 were deposited
in E. coli HB101 hosts. The plasmids were obtained from E. coli RR1
hosts, described in this example, and transformed into E. coli HB101
by standard procedures well known to those of ordinary skill in this
art. The E. coli RR1 hosts were lysed and then centrifuged to
separate the chromosomal DNA, cell DNA and plasmid DNA. The plasmid
DNA, remaining in the supernatant, is precipitated with ethanol and
the precipitate is resuspended in buffer, e.g., TCM (10 mM Tris-HCl, pH
8.0, 10 mM CaCl2, 10 mM MgCl2). The cells for transformation are
prepared as follows: 120 ml of L-broth (1% tryptone, 0.5% yeast
extract, 0.5% NaCl) are inoculated with an 18 hour culture of HB101
NRRL B-11371 and grown to an optical density of 0.6 at 600 nm. Cells
are washed in cold 100 mM NaCl and resuspended for 15 minutes in 20 ml
chilled 50 mM CaCl2. Bacteria are then concentrated to one-tenth of
this volume in CaCl2 and mixed 2:1 (v:v) with annealed plasmid DNA,
prepared as described above. After chilling the cell-DNA mixture for
15 minutes, it is heat shocked at 42°C for 2 minutes, then allowed to
equilibrate at room temperature for ten minutes before addition of
L-broth 10 times the volume of the cell-DNA suspension. Transformed
cells are incubated in broth at 37°C for one hour before inoculating
selective media (L-agar plus 10 µg/ml tetracycline) with 200 µl/plate.
Plates are incubated at 37°C for 48 hours to allow the growth of
transformants.
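
The volumes implied by the transformation procedure above can be worked
through as follows. This is a minimal sketch under two stated assumptions
that are not in the original text: the 2:1 (v:v) ratio is read as
cells:DNA, and the L-broth percentages are treated as grams per 100 ml.
All function and variable names are hypothetical.

```python
RESUSPENSION_ML = 20.0      # chilled 50 mM CaCl2 used to resuspend cells
CONCENTRATION_FACTOR = 10   # cells concentrated to one-tenth of this volume
CELLS_PER_DNA = 2           # "mixed 2:1 (v:v)"; ASSUMED to mean cells:DNA
BROTH_FOLD = 10             # L-broth added at 10 times the mixture volume


def transformation_volumes():
    """Volumes (ml) at each step of the transformation mixture."""
    cells_ml = RESUSPENSION_ML / CONCENTRATION_FACTOR
    dna_ml = cells_ml / CELLS_PER_DNA
    mixture_ml = cells_ml + dna_ml
    broth_ml = mixture_ml * BROTH_FOLD
    return {"cells_ml": cells_ml, "dna_ml": dna_ml,
            "mixture_ml": mixture_ml, "broth_ml": broth_ml}


def l_broth_grams(volume_ml=120.0):
    """Grams of each L-broth component, ASSUMING percent = g per 100 ml."""
    percent = {"tryptone": 1.0, "yeast extract": 0.5, "NaCl": 0.5}
    return {name: volume_ml * p / 100 for name, p in percent.items()}


if __name__ == "__main__":
    for name, value in transformation_volumes().items():
        print(name, value)
    for name, grams in l_broth_grams().items():
        print(name, round(grams, 2), "g")
```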

Example 3 Mapping of Restriction Endonuclease Sites

Restriction endonucleases were obtained from Bethesda Research
Laboratories and New England Biolabs and were used according to the
manufacturers' instructions. The digested DNA fragments were analyzed
electrophoretically on agarose [Helling, R.B., Goodman, H.M. and
Boyer, H.W. (1974) J. Virol. 14, 1235-1244] or acrylamide [Dingman,
C., Fisher, M.P. and Kakefuda, T. (1972) Biochemistry 11, 1242-1250]
gels.

Example 4 DNA Sequencing

DNA fragments were dephosphorylated with bacterial alkaline
phosphatase (Worthington) and labeled at the 5'-ends with
polynucleotide kinase (Boehringer-Mannheim) and [γ-32P]ATP. Following
digestion with a second restriction endonuclease and electrophoretic
separation of the fragments, DNA sequence determination was done
according to the procedure of Maxam and Gilbert [Maxam, A. and
Gilbert, W. (1980) Methods Enzymol. 65, 499-560] and the degradation
products were separated electrophoretically on 0.4 mm acrylamide gels
as described by Sanger and Coulson [Sanger, F. and Coulson, A.R. (1978)
FEBS Letters 87, 107-110].

Example 5 Recombinant Plasmids pHA36 and pHA206

As disclosed in Example 2, albumin clones were selected by
hybridizing to the enriched albumin cDNA probe. Plasmid pHA36
contained the largest insert of an albumin cDNA sequence. Both plasmids
pHA36 and pHA206 have been deposited in a viable E. coli host in the
permanent collection of the Northern Regional Research Laboratory
(NRRL), U.S. Department of Agriculture, Peoria, Illinois, U.S.A.
Their accession numbers in this repository are as follows:
HB101(pHA36) - NRRL B-12551
HB101(pHA206) - NRRL B-12550

E. coli HB101 is a known and widely available host microbe. Its
NRRL accession number is NRRL B-11371.

NRRL B-12550 and NRRL B-12551 are available to the public upon
the grant of a patent. It should be understood that the availability
of these deposits does not constitute a license to practice the
subject invention in derogation of patent rights granted with the
subject instrument by governmental action.

E. coli RR1 and E. coli HB101 are known and widely available host
microbes. Their NRRL accession numbers are NRRL B-12186 and NRRL
B-11371, respectively.

pBR322 is a well known and widely available plasmid. It can be
obtained from the following host deposit by standard procedures:
NRRL B-12014 - E. coli RR1 (pBR322).

YEp6 is a well known and widely available yeast episomal plasmid.
It can be obtained from the following host deposit by standard
procedures:

E. coli HB101 (YEp6) - NRRL B-12093.
Example 6 Assembly of the Serum Albumin Gene

Assembling the pieces together is a straightforward task of
restriction enzymology. There is only one MspI site in the overlapping
DNA sequence of the two cDNA clones. Two enzymatic steps of (i) MspI
digestion of the two DNAs, followed by (ii) the use of ligase, an
enzyme that seals DNA fragments, will give the desired product.
Although two other undesired DNA species will also be obtained in the
course of this recombination reaction, both of them will differ
substantially in size. Thus, separation and isolation of the desired
DNA species will be achieved.
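
The two-step assembly described above can be sketched on toy sequences.
The following is an illustration only, not the actual albumin sequence:
the two short strings stand in for the 5'-proximal and 3'-proximal cDNA
inserts, are hypothetical, and were chosen so that each contains exactly
one MspI site (CCGG, cut after the first C) within their shared overlap.
The function names are likewise hypothetical.

```python
MSPI_SITE = "CCGG"   # MspI recognition sequence
MSPI_OFFSET = 1      # MspI cuts after the first C (C^CGG)

# Hypothetical toy inserts sharing the overlap "ACCGGT".
FIVE_PRIME_INSERT = "ATGAAGTGGGTAACCGGT"
THREE_PRIME_INSERT = "ACCGGTGCTTAA"


def cut_at_unique_site(seq, site=MSPI_SITE, offset=MSPI_OFFSET):
    """Cut seq at its single occurrence of site; return (left, right)."""
    hits = [i for i in range(len(seq) - len(site) + 1)
            if seq.startswith(site, i)]
    if len(hits) != 1:
        raise ValueError("expected exactly one site, found %d" % len(hits))
    cut = hits[0] + offset
    return seq[:cut], seq[cut:]


def assemble(five_prime, three_prime):
    """Join the left part of the 5' insert to the right part of the
    3' insert, mimicking digestion followed by ligation."""
    left, _ = cut_at_unique_site(five_prime)
    _, right = cut_at_unique_site(three_prime)
    return left + right


if __name__ == "__main__":
    product = assemble(FIVE_PRIME_INSERT, THREE_PRIME_INSERT)
    print(product, len(product))
```

Because the cut falls at the same position within the shared overlap of
both inserts, the joined product carries the overlap once, which is the
reason a single common site suffices for the assembly.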

The assembled DNA clone can be used to transform two types of
cells:

(a) Escherichia coli

(b) Saccharomyces cerevisiae

(a) The vector of choice is plasmid pBR322, the same that has
been successfully used for cloning of the two fragmented pieces of the
serum albumin cDNA.

(b) In order to transform yeast with the serum albumin
structural gene sequence, the DNA must be inserted into one of the
existing yeast plasmid vectors. This can be accomplished by taking
advantage of the fact that several restriction endonuclease
recognition sequences are absent from the cloned serum albumin DNA.
Synthetic EcoRI DNA linkers can be ligated to the DNA fragment
containing the serum albumin sequence, followed by insertion (ligation)
into one of the yeast plasmid vectors, e.g., YEp6, at the EcoRI cloning
site. The fused chimeric plasmid can be used to transform yeast
according to an established procedure [Hinnen, A., Hicks, J.B. and
Fink, G.R. (1978) Proc. Natl. Acad. Sci. USA, 75, 1929]. YEp6 can be
obtained from the NRRL repository, as disclosed supra.
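
The reasoning behind the linker approach is that an insert lacking an
internal EcoRI site stays intact when the linkered fragment is digested
with EcoRI. The sketch below illustrates this on a hypothetical toy
insert; the 8-mer linker GGAATTCC is an assumed example of a synthetic
EcoRI linker, and the function names are hypothetical.

```python
ECORI_SITE = "GAATTC"      # EcoRI recognition sequence
ECORI_OFFSET = 1           # EcoRI cuts after the G (G^AATTC)
ECORI_LINKER = "GGAATTCC"  # ASSUMED illustrative 8-mer linker

TOY_INSERT = "ATGAAGTGGGTAACCGGTGCTTAA"  # hypothetical, not albumin


def lacks_site(seq, site=ECORI_SITE):
    """True if the recognition sequence does not occur in seq."""
    return site not in seq


def digest(seq, site=ECORI_SITE, offset=ECORI_OFFSET):
    """Cut a linear sequence at every occurrence of site and return
    the fragments in order."""
    fragments, start = [], 0
    i = seq.find(site)
    while i != -1:
        fragments.append(seq[start:i + offset])
        start = i + offset
        i = seq.find(site, i + 1)
    fragments.append(seq[start:])
    return fragments


if __name__ == "__main__":
    linkered = ECORI_LINKER + TOY_INSERT + ECORI_LINKER
    print("insert lacks EcoRI site:", lacks_site(TOY_INSERT))
    for fragment in digest(linkered):
        print(fragment)
```

If the insert did contain an internal site, digest would return more than
three fragments and the insert would be split, which is why the absence
of the site is checked first.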

Example 7 Expression of the Serum Albumin Gene

The main body of the structural gene will be transcribed by the
E. coli or yeast enzymes. If little or no albumin is produced with
the selected host, then an Escherichia coli promoter DNA sequence
carrying an initiation codon, i.e., ATG, can be ligated at the
beginning of the serum albumin structural gene. Such elements are known
and available, e.g., lac promoter used for the expression of human
interferon gene in E. coli [Proc. Natl. Acad. Sci. 77, 5230 (1980)];
source of promoter DNA [Proc. Natl. Acad. Sci. 76, 760 (1979)]. Also,
see Nature, Vol. 281, October 18, 1979. It has already been
documented that such Escherichia coli promoter sequences function well
in the expression of foreign genes in Escherichia coli [Mercereau-
Puijalon, O., Royal, A., Cami, B., Garapin, A., Krust, A., Gannon, F.
and Kourilsky, P. (1978) Nature 275, 505; and Goeddel, D.V., Kleid,
D.G., Bolivar, F., Heyneker, H.L., Yansura, D.G., Crea, R., Hirose,
T., Kraszewski, A., Itakura, K., and Riggs, A. (1979) Proc. Natl. Acad.
Sci. USA 76, 106]. For expression in yeast, see Rose, M., Casadaban,
M.J. and Botstein, D. (1981) Proc. Natl. Acad. Sci. USA 78, 2460 and
4466.
Example 8 Screening of Clones Producing Albumin

Immunological methods can be used to detect small amounts of
albumin made in a bacterium. Flat disks of flexible polyvinyl are
coated with the IgG fraction from an immune serum and the disks are
pressed onto an agar plate so that antigen released from an in situ
lysed microbial colony can bind to the fixed antibody. The plastic
disk is then incubated with the same total IgG fraction labeled with
radioactive iodine so that other determinants on the bound antigen can
in turn bind the iodinated antibody. Radioactive areas on the disk
expose X-ray film during autoradiography and thus identify colonies
producing the protein which is being screened for. Detailed protocols
of this procedure have been published [Broome, S. and Gilbert, W.
(1978) Proc. Natl. Acad. Sci. USA, 75, 2746]. The purification of
human serum albumin can be accomplished by using procedures well known
in the art. For example, procedures disclosed in a chapter by T.
Peters: Purification and Properties of Serum Albumin, in: The Plasma
Proteins, Putnam, Ed., Academic Press, New York, 1975, can be used.

The work described herein was all done in conformity with
physical and biological containment requirements specified in the NIH
Guidelines.
1. Plasmid pHA36, having a restriction endonuclease pattern as
shown in the drawing.

2. Plasmid pHA206, having a restriction endonuclease pattern as
shown in the drawing.

3. E. coli HB101 (pHA36) having the deposit accession number
NRRL B-12551.

4. E. coli HB101 (pHA206) having the deposit accession number
NRRL B-12550.

5. A microorganism modified to contain a nucleotide sequence
coding for the amino acid sequence of human serum albumin; said
nucleotide sequence is as follows:

6. Nucleotide sequence of the cDNA of human serum albumin, said
nucleotide sequence is as follows:

7. Nucleotide sequence coding for the prepeptide of human serum
albumin, said nucleotide sequence is as follows:

8. Nucleotide sequence coding for pro human serum albumin, said
nucleotide sequence is as follows:

9. Nucleotide sequence coding for the pre pro human serum
albumin, said nucleotide sequence is as follows:
10. A nucleotide sequence according to any of claims
6 to 9, in essentially pure form.

11. A DNA transfer vector comprising a nucleotide
sequence as defined in claim 5.

12. A DNA transfer vector according to claim 11,
transferred to and replicated in a micro-organism.

13. A DNA transfer vector according to claim 12,
which is a plasmid.

14. A DNA transfer vector according to claim 13,
wherein the plasmid is pBR322 or YEp6.

15. A process for preparing human serum albumin,
which comprises culturing a micro-organism according
to claim 5.

16. A DNA transfer vector according to any of
claims 12 to 14, or a process according to claim 15,
wherein the micro-organism is a bacterium or yeast.

17. A vector or process according to claim 16,
wherein the bacterium or yeast is E. coli or Saccharomyces
cerevisiae.